Mutual information between in- and output trajectories of biochemical networks

Biochemical networks can respond to temporal characteristics of time-varying signals. To understand how reliably biochemical networks can transmit information we must consider how an input
signal as a function of time — the input trajectory — can be mapped onto an output trajectory. Here 
we estimate the mutual information between in- and output trajectories using a Gaussian model. 
We study how reliably the chemotaxis network of E. coli can transmit information on the ligand con- 
centration to the flagellar motor, and find the input power spectrum that maximizes the information 
transmission rate. 



Cells continually have to respond to a wide range of 
intra- and extracellular signals. These signals have to 
be detected, encoded, transmitted and decoded by bio- 
chemical networks. In the absence of biochemical noise, 
a particular input signal will lead to a unique output sig- 
nal, allowing the cell to respond appropriately. Recent 
experiments, however, have vividly demonstrated that 
biochemical networks can be highly stochastic [1], and a 
key question is therefore how reliably biochemical net- 
works can transmit information in the presence of noise. 

To address this question, we must recognize that the 
message may be contained in the temporal dynamics of 
the input signal. A well-known example is bacterial 
chemotaxis, where the concentration of the intracellu- 
lar messenger protein depends not on the current ligand 
concentration, but rather on whether this concentration 
has changed in the recent past 0] — the response of the 
network thus depends on the history of the input sig- 
nal. Moreover, the input signal may be encoded into the 
temporal dynamics of the signal transduction pathway. 
For example, stimulation of the rat PC-12 system with a 
neuronal growth factor gives rise to a sustained response 
of the Raf-Mek-Erk pathway, while stimulation with an 
epidermal growth factor leads to a transient response [H . 
In all these cases, the message is encoded not in the con- 
centration of some chemical species at a specific moment 
in time, but rather in its concentration as a function of 
time. Importantly, whether the processing network can 
reliably respond to a signal depends not only on the in- 
stantaneous value of the signal, but also on the time scale 
over which it changes. In general, the in- and output sig- 
nals of biochemical networks are time-continuous signals 
with non-zero correlation times. To understand how reli- 
ably biochemical networks can transmit information, we 
need to know how accurately an input signal as a func- 
tion of time — the input trajectory — can be mapped onto 
an output trajectory. In this article, we take an informa- 
tion theoretic approach to this question. 

A natural measure for the quality of information 
transmission is the mutual information between the input signal $I$ and the network response $O$, given by $M(I,O) = H(O) - H(O|I)$ [4]. Here, $H(O) = -\int dO\, p(O) \log p(O)$, with $p(O)$ the probability distribution of $O$, is the information entropy of the output $O$; $H(O|I) = -\int dI\, p(I) \int dO\, p(O|I) \log p(O|I)$ is the average (over inputs $I$) information entropy of $O$ given $I$, with $p(O|I)$ the conditional probability distribution of $O$
given I. Recently, the mutual information between the 
instantaneous values of the in- and output signals of biochemical networks has been investigated, although
in these studies the temporal correlations in the input 
signals were ignored. Here we investigate the mutual in- 
formation between in- and output trajectories. 
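To make the definition $M(I,O) = H(O) - H(O|I)$ concrete, the following minimal Python sketch evaluates it for a discrete toy channel. The binary symmetric channel, the flip probability `eps` and the function name `mutual_information` are illustrative assumptions and are not part of the analysis in this article; entropies are reported in bits.

```python
import numpy as np

def mutual_information(p_i, p_o_given_i):
    """Return M(I,O) = H(O) - H(O|I) in bits for a discrete channel.

    p_i          -- shape (n_in,), input distribution p(I)
    p_o_given_i  -- shape (n_in, n_out), row i is p(O | I = i)
    """
    p_i = np.asarray(p_i, dtype=float)
    p_o_given_i = np.asarray(p_o_given_i, dtype=float)
    p_o = p_i @ p_o_given_i  # marginal output distribution p(O)

    def entropy(p):
        p = p[p > 0]  # convention: 0 log 0 = 0
        return -np.sum(p * np.log2(p))

    h_o = entropy(p_o)
    # Average over inputs of the entropy of O at fixed I
    h_o_given_i = sum(pi * entropy(row) for pi, row in zip(p_i, p_o_given_i))
    return h_o - h_o_given_i

# Hypothetical example: a binary input that is flipped with probability eps
eps = 0.1
channel = np.array([[1 - eps, eps], [eps, 1 - eps]])
print(mutual_information([0.5, 0.5], channel))
```

A noiseless channel (`eps = 0`) returns one bit, and a channel with `eps = 0.5` returns zero, since the output is then independent of the input.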

Mutual information between trajectories — We consider 
a biochemical network in steady state which has one in- 
put species S with copy number S and one output species 
X with copy number X. The mutual information between 
in- and output trajectories is found by taking the possible input and output signals $I$ and $O$ to be the possible trajectories $S(t)$ and $X(t)$:

$$M(S,X) = \int \mathcal{D}S(t) \int \mathcal{D}X(t)\, p(S(t),X(t)) \log \frac{p(S(t),X(t))}{p(S(t))\,p(X(t))}. \tag{1}$$

Calculating the mutual information between trajecto- 
ries is in general a formidable task, given the high- 
dimensionality of the trajectory space. However, for a 
Gaussian model, which we will employ here, the mutual 
information can be obtained analytically. 
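For trajectories sampled at a finite number of time points, the Gaussian case reduces to a standard determinant formula: for zero-mean, jointly Gaussian vectors with covariance blocks $C_{ss}$, $C_{sx}$ and $C_{xx}$, the mutual information is $\frac{1}{2}\ln\left(|C_{ss}||C_{xx}|/|Z|\right)$, with $Z$ the joint covariance matrix. The sketch below evaluates this expression for a toy linear channel. The exponentially correlated input, the channel $x(t_i) = g\,s(t_i) + \text{noise}$ and all parameter values are assumptions made purely for illustration.

```python
import numpy as np

def gaussian_mutual_information(c_ss, c_sx, c_xx):
    """Mutual information (in nats) between jointly Gaussian vectors s and x.

    Uses M = 0.5 * ln(|C_ss| |C_xx| / |Z|), where Z is the joint covariance
    of v = (s, x) and c_sx[i, j] = <s_i x_j>.
    """
    z = np.block([[c_ss, c_sx], [c_sx.T, c_xx]])
    # slogdet returns (sign, log|det|); it avoids underflow for large matrices
    _, logdet_ss = np.linalg.slogdet(c_ss)
    _, logdet_xx = np.linalg.slogdet(c_xx)
    _, logdet_z = np.linalg.slogdet(z)
    return 0.5 * (logdet_ss + logdet_xx - logdet_z)

# Toy example (assumed): input with exponential correlations sampled at
# N times; output is the scaled input plus independent Gaussian noise.
N, dt, tau, g, sigma2 = 20, 0.1, 1.0, 2.0, 0.5
t = dt * np.arange(N)
c_ss = np.exp(-np.abs(t[:, None] - t[None, :]) / tau)
c_sx = g * c_ss
c_xx = g**2 * c_ss + sigma2 * np.eye(N)
mi_nats = gaussian_mutual_information(c_ss, c_sx, c_xx)
print(mi_nats)
```

For a single sample with correlation coefficient $\rho$ the function reduces to the familiar $-\frac{1}{2}\ln(1-\rho^2)$.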

In this Gaussian model, it is assumed that the input 
signal consists of small temporal variations around some 
steady-state value, obeying Gaussian statistics. This lim- 
its our approach, but seems a reasonable simplification 
given that the input statistics have not been measured for 
most, if not all, biological systems. Moreover, we assume 
that the coupling between the components can be lin- 
earized and that the intrinsic noise is small and Gaussian, 
according to the linear-noise approximation 0]; recent 
modeling studies have shown this gives a good descrip- 
tion of the noise properties of a large class of biochemical 
networks, even when the copy numbers are as low as ten @>@]. Under these assumptions the joint probability distribution of the in- and output signals is described by a multivariate Gaussian,

$$p(\mathbf{v}) = \frac{1}{(2\pi)^{N}\,|\mathbf{Z}|^{1/2}} \exp\left(-\frac{1}{2}\,\mathbf{v}^{T}\mathbf{Z}^{-1}\mathbf{v}\right). \tag{2}$$

The vector $\mathbf{v} = (\mathbf{s}, \mathbf{x})$, with $\mathbf{s} = (s(t_1), s(t_2), \ldots, s(t_N))$ constructed from the input signal sampled at times $t = t_1, \ldots, t_N$, and $\mathbf{x} = (x(t_1), x(t_2), \ldots, x(t_N))$; $s(t)$ and $x(t)$ are the deviations of S and X away from their steady-state values, $\langle S \rangle$ and $\langle X \rangle$, respectively. The $2N \times 2N$ covariance matrix $\mathbf{Z}$ has the form

$$\mathbf{Z} = \begin{pmatrix} \mathbf{C}^{ss} & \mathbf{C}^{xs} \\ \mathbf{C}^{sx} & \mathbf{C}^{xx} \end{pmatrix}, \tag{3}$$

where $\mathbf{C}^{\alpha\beta}$ is an $N \times N$ matrix with elements $C^{\alpha\beta}_{ij} = C_{\alpha\beta}(t_i - t_j) = \langle \alpha(t_i)\beta(t_j) \rangle$. In the limit that the in- and output signals are time-continuous, the mutual information rate between the in- and output trajectories, $R(s,x) = \lim_{T\to\infty} M(s,x)/T$, is given by [HJ

$$R(s,x) = -\frac{1}{4\pi} \int_{-\infty}^{\infty} d\omega\, \ln\left[1 - \frac{|S_{sx}(\omega)|^2}{S_{ss}(\omega)\,S_{xx}(\omega)}\right], \tag{4}$$

where the power spectrum $S_{\alpha\beta}(\omega)$ is the Fourier transform of $C_{\alpha\beta}(t)$. Measuring the output signal as a function of time, $R(s,x)$ is the rate at which the information on the input trajectory increases with time; importantly, $R(s,x)$ takes into account temporal correlations in the in- and output signal. We emphasize that Eq. 4 is exact only for linear systems with Gaussian statistics. Importantly, however, Eq. 4 can also be applied to systems which do not obey Gaussian statistics and to non-linear systems; in these cases it provides a lower bound on the channel capacity of the network [§].

A biochemical network differs from a channel in telecommunication or electronics, in that the reaction that detects the input signal may introduce correlations between the signal and the intrinsic noise of the reactions that constitute the processing network [8]; these correlations are a consequence of the molecular character of the components and thus unique to (bio)chemical systems. If the detection reaction does not introduce correlations, then the power spectrum of the output signal, $S_{xx}(\omega)$, is given by the spectral addition rule [8]:

$$S_{xx}(\omega) = N(\omega) + g^2(\omega)\,S_{ss}(\omega). \tag{5}$$

Here, $N(\omega)$ is the intrinsic noise of the processing network, $S_{ss}(\omega)$ is the power spectrum of the input signal, and $g^2(\omega) = |S_{sx}(\omega)|^2 / S_{ss}(\omega)^2$ is the frequency-dependent gain. Identifying the spectrum of the transmitted signal as $P(\omega) = g^2(\omega)\,S_{ss}(\omega)$, Eq. 4 can be rewritten as

$$R(s,x) = \frac{1}{4\pi} \int_{-\infty}^{\infty} d\omega\, \ln\left[1 + \frac{P(\omega)}{N(\omega)}\right], \tag{6}$$

a well-known result for a time-continuous Gaussian channel. When the detection reaction does introduce correlations between the input signal and the noise of the processing network, one can still define $g^2(\omega)$ and $P(\omega)$ as above and apply Eq. 5. However, in this case $N(\omega)$ and $g^2(\omega)$ are not intrinsic properties of the network, but also depend on the statistics of the input signal.
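The step from the coherence form of the information rate to the signal-to-noise form can be written out explicitly. The short derivation below assumes the Gaussian rate in the form $R = -\frac{1}{4\pi}\int d\omega\, \ln[1 - |S_{sx}|^2/(S_{ss}S_{xx})]$ together with the spectral addition rule $S_{xx} = N + g^2 S_{ss}$ and the definitions $g^2 = |S_{sx}|^2/S_{ss}^2$ and $P = g^2 S_{ss}$; it is a sketch of the algebra rather than a new result.

```latex
% From the coherence form of the rate to the signal-to-noise form,
% assuming S_xx = N + g^2 S_ss, g^2 = |S_sx|^2 / S_ss^2 and P = g^2 S_ss.
\begin{align}
\frac{|S_{sx}(\omega)|^2}{S_{ss}(\omega)\,S_{xx}(\omega)}
  &= \frac{g^2(\omega)\,S_{ss}(\omega)}{S_{xx}(\omega)}
   = \frac{P(\omega)}{N(\omega) + P(\omega)}, \\
1 - \frac{|S_{sx}(\omega)|^2}{S_{ss}(\omega)\,S_{xx}(\omega)}
  &= \frac{N(\omega)}{N(\omega) + P(\omega)}, \\
-\ln\left[1 - \frac{|S_{sx}(\omega)|^2}{S_{ss}(\omega)\,S_{xx}(\omega)}\right]
  &= \ln\left[1 + \frac{P(\omega)}{N(\omega)}\right].
\end{align}
```

Integrating the last line over frequency with the prefactor $1/4\pi$ gives the rate in terms of the ratio $P(\omega)/N(\omega)$ alone.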

Network motifs — The three elementary detection motifs shown in Table I [8] illustrate a number of characteristics of the transmission of trajectories. As a simple example of a time-continuous input signal with a non-zero correlation time, we take the dynamics of S to be a Poissonian birth-and-death process; for large copy numbers, this gives distributions that are approximately Gaussian.
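For such a birth-and-death input the correlation function and power spectrum are known in closed form, which gives a convenient check on numerical spectra. The sketch below assumes a birth rate `k` and a per-molecule death rate `lam` (both names and values are hypothetical), so that the steady-state correlation function is $C(t) = (k/\lambda)\,e^{-\lambda |t|}$, and uses the convention $S(\omega) = \int dt\, C(t)\, e^{-i\omega t}$, which yields a Lorentzian.

```python
import numpy as np

def birth_death_spectrum(omega, k, lam):
    """Lorentzian spectrum 2k / (omega^2 + lam^2) of a birth-death process."""
    return 2.0 * k / (omega**2 + lam**2)

def spectrum_from_correlation(omega, k, lam, t_max=100.0, n=200001):
    """Numerically Fourier-transform C(t) = (k / lam) * exp(-lam * |t|).

    C(t) is even, so S(omega) = 2 * integral_0^inf C(t) cos(omega t) dt.
    The integral is truncated at t_max and evaluated with the trapezoid rule.
    """
    t = np.linspace(0.0, t_max, n)
    f = (k / lam) * np.exp(-lam * t) * np.cos(omega * t)
    dt = t[1] - t[0]
    integral = dt * (f.sum() - 0.5 * (f[0] + f[-1]))
    return 2.0 * integral

# Hypothetical rates: mean copy number k / lam = 10, correlation time 1 / lam = 1
k, lam = 10.0, 1.0
for w in (0.0, 0.5, 2.0):
    print(w, birth_death_spectrum(w, k, lam), spectrum_from_correlation(w, k, lam))
```

The correlation time $1/\lambda$ sets the corner frequency above which the input power falls off as $\omega^{-2}$.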

Motif I describes the reversible binding between, for example, a ligand and a receptor, or an enzyme and its substrate. For this motif only we take the input signal to be the total number of both bound and unbound molecules, $S_T(t) = S(t) + X(t)$. We find that this motif acts as a low-pass filter for information. Specifically, the gain-to-noise ratio $g^2(\omega)/N(\omega)$, which determines how accurately an input signal at frequency $\omega$ can be transmitted, is approximately constant at low frequencies but decays as $\omega^{-2}$ for high frequencies. Since input signals of biochemical networks are commonly detected via this motif, this result suggests that high-frequency input signals are typically not propagated reliably.

Motif II describes the scenario in which the signaling 
molecule is deactivated upon detection. An important 
example is activation of membrane receptors by ligand 
binding followed by endocytosis. If the input signal is a Poissonian birth-and-death process, the mutual information between instantaneous values of S and X is zero [16]: X gives no information about the current value of S. Indeed, to understand how cells can use this motif to transmit information, we must consider the mutual information between in- and output trajectories. Interestingly, for this motif $N(\omega)$ vanishes at high frequencies, while $g^2(\omega)$ approaches a constant value; the gain-to-noise ratio thus diverges at high frequencies, meaning that this motif can reliably transmit rapidly varying input signals [rH |.

Motif III is a coarse-grained model for enzymatic 
reactions or gene activation; the enzyme-substrate or 
transcription-factor-DNA binding reaction, respectively, 
has been integrated out. For this motif, which in contrast to the other two obeys the spectral addition rule (Eq. 5), $g^2(\omega)$ and $N(\omega)$ have the same functional dependence on $\omega$. Hence, $g^2(\omega)/N(\omega)$ is independent of $\omega$, which means that this motif can transmit signals at all frequencies with the same fidelity.

For both motifs II and III the mutual information between trajectories does not depend on the deactivation rate $\mu$ of the read-out component X; $g^2(\omega)$ and $N(\omega)$ depend in the same way on $\mu$. The information on the input trajectory $s(t)$ is encoded solely in the statistics of the production events of X; decays of X occur independently of S and hence provide no new information about S. These observations may suggest that if an input signal is detected via one of these motifs, the deactivation rate of X is not important. However, if the information encoded in X needs to be transmitted to a downstream pathway, then this transmission rate will in general depend on $\mu$.

TABLE I: Three elementary detection motifs. The input signal is modeled via $\emptyset \to S$ and $S \to \emptyset$.

Recently, Endres and Wingreen [12] have argued that
detection motif II is superior to motif III in measuring 
average concentrations. Our analysis shows that motif 
II can also more reliably transmit information in time- 
varying signals, due to the more accurate transmission of 
high frequency components of the input. 

Bacterial chemotaxis — A classical example of a biolog- 
ical system in which not only the instantaneous value of 
the input signal is important, but also its history, is the 
chemotaxis system of Escherichia coli 0]. The messenger protein CheY is phosphorylated (CheY$_p$) by the kinase CheA and dephosphorylated by the phosphatase CheZ. The kinase activity is rapidly inhibited by receptor-ligand binding, allowing the system to respond to changes in ligand concentration on short time scales. Receptor methylation slowly counteracts the effect of ligand binding on CheA activity, allowing the system to adapt to changes in ligand concentration on longer time scales. An open question is how this network processes the ligand signal in the presence of noise. Here, we study how reliably the chemotaxis network can transmit information in time-varying input signals.

Recently, Tu et al. have shown that a minimal model can accurately describe the response of the chemotaxis system to a wide range of time-varying input signals [3]. In this model it is assumed that receptor-ligand binding and the kinase response are much faster than CheY$_p$ dephosphorylation and receptor (de)methylation; hence the kinase activity is in quasi-steady-state. Linearizing around steady-state, we obtain the following model [3]:



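$$a(t) = \alpha\, m(t) - \beta\, l(t), \tag{7}$$

$$\frac{dm}{dt} = -\frac{a(t)}{\tau_m} + \eta_m(t), \tag{8}$$

$$\frac{dy}{dt} = \gamma\, a(t) - \frac{y(t)}{\tau_z} + \eta_y(t). \tag{9}$$

Here, $a(t)$ and $m(t)$ are, respectively, the deviations of the fraction of active kinases and the receptor methylation level from their steady-state values; $l(t)$ and $y(t)$ are the fractional changes in the ligand and CheY$_p$ concentrations relative to steady-state levels; $\tau_m$ and $\tau_z$ are the time scales for receptor (de)methylation and CheY$_p$ dephosphorylation, with $\tau_m > \tau_z$; $\eta_m$ and $\eta_y$ are Gaussian white-noise sources that are independent of one another, and of the ligand signal: $\langle \eta(t) \rangle = 0$; $\langle \eta(t)\eta(t') \rangle = \langle \eta^2 \rangle\, \delta(t - t')$; $\langle \eta_m(t)\eta_y(t') \rangle = \langle \eta(t)\, l(t') \rangle = 0$.

The statistics of the input signal are described by the power spectrum $S_{ll}(\omega)$. This system obeys the spectral addition rule (Eq. 5), and the power spectrum of $y$ is given by

$$S_{yy}(\omega) = N_{l\to y}(\omega) + g^2_{l\to y}(\omega)\, S_{ll}(\omega), \tag{10}$$

with $N_{l\to y}(\omega)$ and $g^2_{l\to y}(\omega)$ being intrinsic properties of the chemotaxis network:

$$N_{l\to y}(\omega) = \frac{\gamma^2 \alpha^2 \langle \eta_m^2 \rangle}{(\omega^2 + \tau_z^{-2})(\omega^2 + \alpha^2/\tau_m^2)} + \frac{\langle \eta_y^2 \rangle}{\omega^2 + \tau_z^{-2}}, \tag{11}$$

$$g^2_{l\to y}(\omega) = \frac{\gamma^2 \beta^2 \omega^2}{(\omega^2 + \tau_z^{-2})(\omega^2 + \alpha^2/\tau_m^2)}. \tag{12}$$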



Eq. 12 shows that the gain is small at low frequencies, due to adaptation of the kinase activity via receptor methylation [14] (Fig. 1a). This network is therefore unable to respond to low-frequency variations in the ligand signal. As noted in [3, 13], the gain also decreases at high frequencies, due to the time taken for CheY$_p$ dephosphorylation by CheZ. However, we see that the noise also decreases with increasing frequency. In fact, at high frequencies the methylation dynamics can be ignored, and the dynamics of CheY$_p$ are approximately those of motif III, discussed above; $g^2_{l\to y}(\omega)/N_{l\to y}(\omega)$ increases to a constant value, showing that, in contrast to the conclusions of [14], high-frequency signals can be reliably encoded in the trajectory $y(t)$.
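The frequency dependence described above can be checked numerically. The sketch below assumes the linearized model in the form $a = \alpha m - \beta l$, $dm/dt = -a/\tau_m + \eta_m$, $dy/dt = \gamma a - y/\tau_z + \eta_y$, for which the gain and noise follow by Fourier transformation. The values of $\alpha$, $\beta$, $\gamma$, $\tau_m$, $\tau_z$ and $\langle\eta_y^2\rangle$ are those quoted in the caption of Fig. 1; `ETA_M2` is a placeholder chosen only for illustration, and the function names are hypothetical.

```python
import numpy as np

ALPHA, BETA = 2.7, 1.3
TAU_M, TAU_Z = 8.0, 0.5   # s
GAMMA = 8.0               # 1/s
ETA_Y2 = 0.002            # 1/s
ETA_M2 = 1e-3             # 1/s, placeholder value (assumption)

def gain_ly(w):
    """Gain g^2 from ligand to CheYp under the assumed linearized model."""
    return (GAMMA * BETA * w) ** 2 / (
        (w**2 + TAU_Z**-2) * (w**2 + (ALPHA / TAU_M) ** 2)
    )

def noise_ly(w):
    """Noise N from ligand to CheYp: methylation noise plus CheYp noise."""
    methylation = (GAMMA * ALPHA) ** 2 * ETA_M2 / (
        (w**2 + TAU_Z**-2) * (w**2 + (ALPHA / TAU_M) ** 2)
    )
    phosphorylation = ETA_Y2 / (w**2 + TAU_Z**-2)
    return methylation + phosphorylation

w = np.logspace(-3, 3, 7)
for wi, ratio in zip(w, gain_ly(w) / noise_ly(w)):
    print(f"omega = {wi:8.3f} 1/s   g^2/N = {ratio:12.3f}")
```

In this form the gain vanishes as $\omega^2$ at low frequency, while the gain-to-noise ratio saturates at $\gamma^2\beta^2/\langle\eta_y^2\rangle$ at high frequency, independently of the placeholder value.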

However, the ultimate response of this system is that 
of the flagellar motor. Binding of CheY$_p$ to the motor increases the tendency of the motor to switch to the clockwise state, which causes the bacterium to "tumble" and change direction. Assuming that CheY$_p$ binding to the motor is fast, and that the motor response can be linearized, the clockwise bias of the motor $b(t)$ is determined by

$$\frac{db}{dt} = k\, y(t) - \frac{b(t)}{\tau_b} + \eta_b(t), \tag{13}$$

where $\tau_b$ is the typical motor switching time and $\eta_b$ represents Gaussian white noise, uncorrelated with $\eta_m$ and $\eta_y$. Applying the spectral addition rule, the power spectrum of the motor is $S_{bb}(\omega) = N_{y\to b}(\omega) + g^2_{y\to b}(\omega)\, S_{yy}(\omega)$, where $N_{y\to b}(\omega) = \langle \eta_b^2 \rangle / (\omega^2 + \tau_b^{-2})$ is the intrinsic noise of the motor, and $g^2_{y\to b}(\omega) = k^2 / (\omega^2 + \tau_b^{-2})$ is the frequency-dependent gain of the motor. Inserting $S_{yy}(\omega)$ of Eq. 10 into this expression for $S_{bb}(\omega)$, we see that the total noise added between the ligand and the motor is given by $N_{l\to b}(\omega) = N_{y\to b}(\omega) + g^2_{y\to b}(\omega)\, N_{l\to y}(\omega)$, while the overall gain of the network is $g^2_{l\to b}(\omega) = g^2_{l\to y}(\omega)\, g^2_{y\to b}(\omega)$.

Fig. 1b shows that $g^2_{l\to b}(\omega)$ is large at frequencies $\tau_m^{-1} \lesssim \omega \lesssim \tau_z^{-1} \sim \tau_b^{-1}$, while $N_{l\to b}(\omega)$ monotonically decreases with increasing frequency. Importantly, at high frequencies $\omega \gg \tau_z^{-1}, \tau_b^{-1}$, the gain $g^2_{l\to b}(\omega) \sim \omega^{-4}$ since both $g^2_{l\to y}(\omega)$ and $g^2_{y\to b}(\omega)$ decrease as $\omega^{-2}$, while $N_{l\to b}(\omega) \sim \omega^{-2}$ since the dominant noise contribution is the intrinsic noise of motor switching $N_{y\to b}(\omega)$. As a result, $g^2_{l\to b}(\omega)/N_{l\to b}(\omega)$ scales as $\omega^{-2}$ for high frequencies. Hence, while high-frequency fluctuations in $l(t)$ are reliably encoded in the trajectory $y(t)$, this information is not propagated to the motor. In essence, the high-frequency variations of $l(t)$ are filtered by the slow dynamics of CheY$_p$ dephosphorylation and motor switching, and are therefore masked by the inevitable intrinsic noise of motor switching.
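The composition of two stages that each obey the spectral addition rule, and the resulting $\omega^{-2}$ scaling, can be verified with a few lines of Python. The sketch assumes first-order linear dynamics for both the CheY$_p$ and motor stages, as in the model above; `ETA_M2` and the motor coupling `K` are placeholder values chosen only for illustration, and all function names are hypothetical.

```python
import numpy as np

ALPHA, BETA, GAMMA = 2.7, 1.3, 8.0
TAU_M, TAU_Z, TAU_B = 8.0, 0.5, 0.5   # s
ETA_Y2, ETA_B2 = 0.002, 0.5           # 1/s
ETA_M2, K = 1e-3, 1.0                 # placeholders (assumptions)

def stage_ly(w):
    """Gain and noise from ligand to CheYp (assumed linearized model)."""
    denom = (w**2 + TAU_Z**-2) * (w**2 + (ALPHA / TAU_M) ** 2)
    gain = (GAMMA * BETA * w) ** 2 / denom
    noise = (GAMMA * ALPHA) ** 2 * ETA_M2 / denom + ETA_Y2 / (w**2 + TAU_Z**-2)
    return gain, noise

def stage_yb(w):
    """Gain and noise from CheYp to motor bias."""
    lorentz = 1.0 / (w**2 + TAU_B**-2)
    return K**2 * lorentz, ETA_B2 * lorentz

def cascade(w):
    """Overall gain and noise from ligand to motor bias.

    Gains multiply; upstream noise is amplified by the downstream gain
    and added to the downstream intrinsic noise.
    """
    g1, n1 = stage_ly(w)
    g2, n2 = stage_yb(w)
    return g1 * g2, n2 + g2 * n1

def gain_to_noise(w):
    gain, noise = cascade(w)
    return gain / noise

# A tenfold increase in frequency should reduce g^2/N about a hundredfold
print(gain_to_noise(1e4) / gain_to_noise(1e3))
```

Because the high-frequency limit is dominated by the motor noise, the printed ratio does not depend on the placeholder `ETA_M2`, and `K` cancels in the ratio.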

The goal of the chemotaxis network is to determine whether the ligand concentration has increased or decreased. This binary decision has to be made on the timescale of a motor switching event, which means that the network should recover at least one bit of information from the input trajectory over this timescale: $R(l,b) > 1\,\mathrm{bit}/\tau_b = 2\,\mathrm{bits\,s^{-1}}$. Our results allow us to predict the input power spectrum $S_{ll}(\omega)$ that maximizes $R(l,b)$ for a given power constraint $\sigma_l^2$ (see Fig. 1c), which is peaked around $\omega \approx 1\,\mathrm{s}^{-1}$. Fig. 1d shows the corresponding optimal information rate as a function of $\sigma_l^2$, and suggests that to achieve $R(l,b) > 2\,\mathrm{bits\,s^{-1}}$ a signal variance of at least $\sigma_l^2 \approx 2.5$ is required. The predicted form of the gain-to-noise ratio and the optimal input power spectrum could be tested by exposing E. coli cells to oscillating stimuli with different frequencies, for example in a microfluidic device, and measuring the (cross) power spectra of the motor bias and the stimulus.
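The optimal spectrum under a power constraint follows from the standard water-filling construction: power is allocated as $\max(0, L - N(\omega)/g^2(\omega))$, with the level $L$ adjusted until the total power matches the constraint. The sketch below finds $L$ by bisection on a uniform frequency grid. The U-shaped `noise_over_gain` curve is a hypothetical stand-in for $N_{l\to b}(\omega)/g^2_{l\to b}(\omega)$, not the curve of the chemotaxis network, and the function name is an assumption.

```python
import numpy as np

def water_filling(omega, noise_over_gain, total_power, n_iter=200):
    """Return (spectrum, level) with spectrum = max(0, level - N/g^2).

    The level is found by bisection so that the spectrum integrates
    (rectangle rule on a uniform grid) to total_power.
    """
    d_omega = omega[1] - omega[0]

    def allocated(level):
        return np.sum(np.clip(level - noise_over_gain, 0.0, None)) * d_omega

    lo = float(noise_over_gain.min())   # allocates zero power
    hi = lo + total_power / d_omega     # allocates at least total_power
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if allocated(mid) < total_power:
            lo = mid
        else:
            hi = mid
    level = 0.5 * (lo + hi)
    return np.clip(level - noise_over_gain, 0.0, None), level

# Hypothetical inverse gain-to-noise curve with a minimum at omega = 1
omega = np.linspace(0.05, 10.0, 2000)
noise_over_gain = 0.1 * (omega**2 + omega**-2)
spectrum, level = water_filling(omega, noise_over_gain, total_power=2.5)
print(level, omega[spectrum.argmax()])
```

Frequencies where $N/g^2$ exceeds the level receive no power at all, so the optimal input is band-limited to the window in which the network transmits most reliably.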

The input signal that a bacterium perceives depends not only on the spatio-temporal correlations of the ligand concentration in the environment but also on its swimming behavior, which in turn depends on the input signal itself: as Fig. 1b shows, E. coli is unable to reliably respond to high- ($\omega \gg \tau_z^{-1}, \tau_b^{-1}$) or low-frequency ($\omega \ll \tau_m^{-1}$) stimuli. This means that, in order to find food, E. coli should swim neither too slowly nor too fast. Specifically, our predicted optimal input spectrum suggests that chemotaxis is most efficient when the spatio-temporal correlations of the ligand and the swimming speed of the bacterium are matched to give a typical frequency of the ligand signal of about $\omega \approx 1\,\mathrm{s}^{-1}$. Further work is needed to study whether nature has optimized this feedback between swimming and signaling, and to explore the naturally occurring chemoattractant distributions that E. coli would experience.

FIG. 1: Information transmission in the E. coli chemotaxis network. The network gain $g^2(\omega)$ (dashed line), noise $N(\omega)$ (dotted line) and $g^2(\omega)/N(\omega)$ (full line) are shown between (a) ligand and CheY$_p$ concentrations, and (b) ligand and motor bias. (c) Water filling approach for the optimal input signal [11]. The optimal power spectrum, subject to a total power constraint $\sigma_l^2 = \int S_{ll}(\omega)\, d\omega$, is given by $S_{ll}(\omega) = L - N_{l\to b}(\omega)/g^2_{l\to b}(\omega)$, with $L$ chosen such that the shaded area matches $\sigma_l^2$. (d) $R(l,b)$ evaluated numerically for different $\sigma_l^2$ values when the corresponding optimal input power spectrum is chosen. The following parameter values were used, estimated from [13, 15]: $\alpha = 2.7$, $\beta = 1.3$, $\tau_m = 8\,\mathrm{s}$, lO^s" 1, $\gamma = 8\,\mathrm{s}^{-1}$, $\tau_z = 0.5\,\mathrm{s}$, $\langle \eta_y^2 \rangle = 0.002\,\mathrm{s}^{-1}$, L, $\tau_b = 0.5\,\mathrm{s}$, $\langle \eta_b^2 \rangle = 0.5\,\mathrm{s}^{-1}$.